Naloxone Access and Immunity Policies (Independent Variables)¶

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import nltk
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
import seaborn as sns
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
import json

import plotly.offline as py
py.init_notebook_mode(connected=True)
import plotly.graph_objs as go
import plotly.figure_factory as ff
/opt/conda/lib/python3.9/site-packages/geopandas/_compat.py:111: UserWarning:

The Shapely GEOS version (3.10.3-CAPI-1.16.1) is incompatible with the GEOS version PyGEOS was compiled with (3.10.4-CAPI-1.16.2). Conversions between both will be slow.

In [2]:
pdaps_path = "data/pdaps.csv"
df = pd.read_csv(pdaps_path)

Overview of Variables and Dataset¶

Dataset¶

The dataset is a survey of state implementation of nalaxone access and immunity laws over time generated by the Prescription Drug Abuse Policy System (PDAPS), a data and research hub maintained by the Center for Public Health Law Research at the Temple University Beasley School of Lawby with funding from the National Institute on Drug Abuse. This data set in particular also received support from the Opioid Research Workgroup from the University of Florida College of Pharmacy. It examines policies from January 1, 2001 to January 1, 2022.

This data was generated with the intent to map the previous and current state of naloxone prescription and dispensing regulation policies, as well as to inform legal researchers and future policy. This data is directly applicable to our research, as it contains much of the direct information about opioid overdose prevention policies that we plan to assess through this project.

The policies included in the survey include those such as "Does the Jurisdiction have a nalaxone access law?" and "Do prescribers have immunity from civil liability for prescribing, dispensing or distributing naloxone to a layperson?" Each question is represented as a row in the dataset, and the responses of "Yes/No/Not Applicable (or no Data) are coded as "1, 0, 2" in the dataset, respectively.

The raw dataset has 5 rows and 36 columns, with each row representing the policy environment in a given state during the time range specified by the Effective Date and Valid Through Date variables. Each change in time period represents an ammendment to the state's Nalaxone access and immunity laws -- For example, from 2001/01/01 to 2015/06/07, Alabama did not have a policy permitting access to Nalaxone, but the next row reflects how the law was ammended on 2015/06/08 to permit access.

A strength of the dataset is that it is quite clean (apart from the variable names that can be quickly recoded), since all the categorical variables are nicely one hot encoded. A weakness is the high proportion of null values from 2001 to around 2015--this is not intrinsic to the dataset, but rather a consequence of the fact that most states did not begin adopting naloxone immunity provisions until the early 2010s.

This data was collected ethically - state policies are public record, as are legal policies regulating medical professionals. No surveying, testing, or otherwise assessments with capacity for unethical conduct were required for the collection of this data.

In [3]:
df.head()
Out[3]:
Jurisdictions Effective Date Valid Through Date naaddressoaayn nahealthcrimproyn nanapimm1yn narcimm1yn nahealthcivliayn nanapimm2yn narcimm2yn ... pharmacist-dispensing-method_Directly authorized by legislature naimmcrimprolpyn nanapimm3yn narcimm3yn naimmcivlialpyn nanapimm4yn narcimm4yn naloxone-crimpossesion naloxone-crimpossessionprog crim-possess-reasonable
0 Alabama 2001-01-01 2015-06-07 0 2 2 2 2 2 2 ... 2 2 2 2 2 2 2 2 2 2
1 Alabama 2015-06-08 2016-05-09 1 1 0 0 1 0 0 ... 0 1 0 1 1 0 1 0 2 2
2 Alabama 2016-05-10 2022-01-01 1 1 0 0 1 0 0 ... 0 1 0 1 1 0 1 0 2 2
3 Alaska 2001-01-01 2016-03-14 0 2 2 2 2 2 2 ... 2 2 2 2 2 2 2 2 2 2
4 Alaska 2016-03-15 2017-03-21 1 0 2 2 1 0 0 ... 0 0 2 2 1 0 0 0 2 2

5 rows × 36 columns

Variables¶

Primary Questions¶

The 33 policy variables, denoted by the columns in the table, correlate to 12 primary questions and their related sub-questions. Currently, we believe that all of these variables could be pertinent within our analysis.

The first question asks if the jurisdiction has a nalaxone access law. The next type of primary immunity questions are structured as such:

  1. Does [agent] have immunity from [liability] for prescribing, dispensing, or distributing naloxone to a layperson?

The [agent]s encompassed by the survey include:

  • prescribers
  • dispensers
  • laypersons

While the [liability]s include:

  • civil liability
  • criminal liability
  • prrofessional sanctions

In the data cleaning below, these 8 questions are coded as '[agent]-[liability]-immun'

The other 3 primary questions include:

  1. Are prescriptions of Naloxone authorized to third-parties?
  1. Are pharmacists allowed to dispense or distribute naloxone without a patient-specific prescription from another medical professional?
  1. Does the law remove criminal liability for possession of naloxone without a prescription?

Subquestions¶

For primary questions 1, 2, and 4, there are subquestions that ask if the [agent] is required to

  • Participate in a naloxone administration program as a required condition of immunity
  • Act with reasonable care

For primary question 3, if the answer is Yes, the subquestions ask about the method(s) of distribution allowed by the law:

  • Standing order
  • Protocol order
  • Collaborative practice agreement
  • Pharmacist prescriptive authority
  • Directly authorized by legislature

Data Cleaning¶

The first step of the cleaning process was to recode the variable names to more clearly reflect the survey questions as described above.

Then, we reshaped the data such that each row represents a state and the policies it had in place during a given year, so that it would match the granularity of the dependent variable (deaths per state in a given year, see eda-kff file). This resulted in a dataframe with 1071 rows (51 jurisdictions * 21 years of analysis between 2001 and 2021).

The NaN values were preserved so that we could first analyze the percentage of null values before imputing them.

In [4]:
#rename columns 
new_cols = {
    'naaddressoaayn': 'nal_access_law', 
    'nahealthcrimproyn': 'presc_crim_immun', 
    'nanapimm1yn': 'presc_crim_immun_nal_program', 
    'narcimm1yn': 'presc_crim_immun_reason_care',
    'nahealthcivliayn':'presc_civil_immun', 
    'nanapimm2yn': 'presc_civil_immun_nal_program', 
    'narcimm2yn': 'presc_civil_immun_reason_care', 
    'naloxone-presprof': 'presc_profsanctions_immun',
    'naloxone-dispcrim': 'disp_crim_immun', 
    'naloxone-dcrimpro': 'disp_crim_immun_nal_program', 
    'naloxone-dcrimcare': 'disp_crim_immun_reason_care',
    'naloxone-dispciv': 'disp_civil_immun', 
    'naloxone-dcivprog':'disp_civil_immun_nal_program', 
    'naloxone-dcivcare': 'disp_civil_immun_reason_care',
    'naloxone-presprof-disp': 'disp_profsanctions_immun', 
    'naloxone-thirdparty': 'third_party_presc', 
    'naloxone-thirdprog': 'third_party_presc_nal_program',
    'naloxone-thirdcare': 'third_party_reason_care', 
    'pharmacist-dispensing': 'pharm_dispense_no_prescription',
    'pharmacist-dispensing-method_Standing_order': 'pharm_dispense-standing_order',
    'pharmacist-dispensing-method_Protocol order': 'pharm_dispense-protocol_order',
    'pharmacist-dispensing-method_Naloxone-specific collaborative practice agreement':'pharm_dispense-collab_agree',
    'pharmacist-dispensing-method_Pharmacist prescriptive authority':'pharm_dispense-presc_auth',
    'pharmacist-dispensing-method_Directly authorized by legislature':'pharm_dispense-legislature',
    'naimmcrimprolpyn': 'lay_crim_immun', 
    'nanapimm3yn':'lay_crim_immun_nal_program', 
    'narcimm3yn':'lay_crim_immun_reason_care', 
    'naimmcivlialpyn': 'lay_civil_immun',
    'nanapimm4yn': 'lay_civil_immun_nal_progam', 
    'narcimm4yn': 'lay_civil_immum_reason_care', 
    'naloxone-crimpossesion': 'possess_crim_immun',
    'naloxone-crimpossessionprog': 'possess_crim_immun_nal_program', 
    'crim-possess-reasonable': 'possess_crim_immun_reason_care'
}

df.rename(index=int, columns=new_cols, inplace=True)
In [5]:
from pandas.core.groupby.groupby import MultiIndex
from collections import defaultdict

split_dict = defaultdict(list)

#separate ranges of policy effective dates into individual years 
for ind in df.index:
  start_year, end_year = int(df['Effective Date'][ind].split('-')[0]), int(df['Valid Through Date'][ind].split('-')[0])
  state = df['Jurisdictions'][ind]

  for year in range(start_year, end_year+1):
    policies = df.iloc[ind, 3:].values
    split_dict[(year,state)]= policies

#rename columns and create new df from dict
split_cols = df.columns[3:]
df = pd.DataFrame.from_dict(split_dict, orient='index', columns = split_cols)

df.index = MultiIndex.from_tuples(df.index)
df = df.reset_index()
df.rename(columns={'level_0':'Year','level_1':'State'}, inplace=True)
df = df[df['Year'] < 2022]
In [6]:
print('data shape:', df.shape)
df.head()
df.replace(2, np.nan, inplace=True)
data shape: (1071, 35)

Description of Data¶

The histogram below shows a distribution of policy lengths. Apart from the high count of 0 values (indicating either null or no policy), we can observe a second peak at around 6-7 years. Again, this is consistent with our observation that states did not widely adopt immunity policies until around 2015. The outlier at 20 years is likely New Mexico, which was the first state to adopt a naloxone access law.

In [7]:
avg_len = df.groupby('State').agg('sum').iloc[:,1:]
avg_len.stack().plot.hist(title='Distribution of Policy Lengths (Years)')
Out[7]:
<AxesSubplot:title={'center':'Distribution of Policy Lengths (Years)'}, ylabel='Frequency'>

Handling Null Values¶

As mentioned above, most states did not adopt naloxone immunity-related policies until the early 2010s--hence, around 61% of the data is currently null values.

In [8]:
#find percentage of null values in entire dataset
nulls = df.isna().sum().sum()
nulls/df.size
Out[8]:
0.6115512871815393

Per the histogram below, most of the policy variables contain around 60-70% null values; with the nal-access-law variable being the only one with complete data.

In [9]:
null_df = df.groupby('Year').agg(lambda x: x.isna().sum()).iloc[:,1:]
annual_nulls = null_df.sum(axis = 1)
annual_totals = 51*33

policy_nulls = null_df.sum()
policy_totals = 51*21 

(policy_nulls / policy_totals).sort_values(ascending=False).plot.hist(figsize = (10,5))
Out[9]:
<AxesSubplot:ylabel='Frequency'>
In [10]:
policy_nulls.sort_values(ascending=True)
Out[10]:
nal_access_law                                   0
disp_profsanctions_immun                       637
disp_civil_immun                               637
third_party_presc                              637
presc_profsanctions_immun                      637
disp_crim_immun                                637
presc_civil_immun                              637
lay_civil_immun                                637
possess_crim_immun                             637
presc_crim_immun                               637
lay_crim_immun                                 637
pharm_dispense_no_prescription                 637
pharm_dispense-legislature                     713
pharm_dispense-presc_auth                      713
pharm_dispense-collab_agree                    713
pharm_dispense-protocol_order                  713
pharmacist-dispensing-method_Standing order    713
third_party_presc_nal_program                  714
third_party_reason_care                        714
lay_civil_immun_nal_progam                     719
lay_civil_immum_reason_care                    719
disp_civil_immun_reason_care                   734
disp_civil_immun_nal_program                   734
presc_civil_immun_reason_care                  740
presc_civil_immun_nal_program                  740
lay_crim_immun_nal_program                     763
lay_crim_immun_reason_care                     763
presc_crim_immun_reason_care                   770
presc_crim_immun_nal_program                   770
disp_crim_immun_reason_care                    775
disp_crim_immun_nal_program                    775
possess_crim_immun_nal_program                 961
possess_crim_immun_reason_care                 961
dtype: int64

Between 2012 and 2015, the density of null values per year declines sharply--from 2015 onward, less than 30% of the data is null in a given year.

In [11]:
(annual_nulls / annual_totals).plot.line(ylabel="Proportion of Nulls", title='Null Proportion per Year', grid=True)
plt.xticks(range(2000,2022,4))
plt.show()

Ultimately, we decided to simply impute null values with 0 -- given that most states did not begin imputing naloxone immunity laws until 2015, it is reasonable to equate a lack of data with a lack of a policy.

In [12]:
df.fillna(0)
Out[12]:
Year State nal_access_law presc_crim_immun presc_crim_immun_nal_program presc_crim_immun_reason_care presc_civil_immun presc_civil_immun_nal_program presc_civil_immun_reason_care presc_profsanctions_immun ... pharm_dispense-legislature lay_crim_immun lay_crim_immun_nal_program lay_crim_immun_reason_care lay_civil_immun lay_civil_immun_nal_progam lay_civil_immum_reason_care possess_crim_immun possess_crim_immun_nal_program possess_crim_immun_reason_care
0 2001 Alabama 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1 2002 Alabama 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
2 2003 Alabama 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
3 2004 Alabama 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
4 2005 Alabama 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
1116 2017 Wyoming 1 1.0 0.0 1.0 1.0 0.0 1.0 1.0 ... 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0
1117 2018 Wyoming 1 1.0 0.0 1.0 1.0 0.0 1.0 1.0 ... 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0
1118 2019 Wyoming 1 1.0 0.0 1.0 1.0 0.0 1.0 1.0 ... 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0
1119 2020 Wyoming 1 1.0 0.0 1.0 1.0 0.0 1.0 1.0 ... 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0
1120 2021 Wyoming 1 1.0 0.0 1.0 1.0 0.0 1.0 1.0 ... 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0

1071 rows × 35 columns

Visualizations¶

In the following visualizations, we selected the 12 Primary Policy questions (as defined in the previous section) for analysis, and denoted them within the immun-pols variable

  1. In 2021, all states and DC had some sort of naloxone access law, and almost all states allow pharmacists to dispense without a patient specific prescription. The least common policy by far is the provision of immunity from criminal liability for possession of naloxone without a prescription; with only 15 states providing immunity.
In [13]:
immun_pols = ['nal_access_law','presc_crim_immun','presc_civil_immun',
              'presc_profsanctions_immun','disp_crim_immun','disp_civil_immun',
              'disp_profsanctions_immun', 'third_party_presc','pharm_dispense_no_prescription',
              'lay_crim_immun','lay_civil_immun', 'possess_crim_immun']
df21 = df[(df['Year']==2021)].loc[:,immun_pols]

pols_2021 = pd.DataFrame()
pols_2021['Yes'] = df21.sum()
pols_2021['No'] = df21.apply(lambda x: x==0).sum()

pols_2021.sort_values(by='Yes', ascending=False).plot.bar(stacked=True, figsize = (10,5), color = {'Yes':'#479cd1', 'No':'#ffe330'})

plt.xlabel('Policy', fontsize=12)
plt.ylabel('Number of States', fontsize=12)
plt.title('Adoption of Primary Policies in 2021', fontsize=16)
plt.show()
In [14]:
law1_count = df.groupby('Year').agg('sum', numeric_only=True).loc[:,immun_pols]
law1_count.plot.line(figsize = (10, 5), grid = True, xlabel = "Year", ylabel = "Num States", title='Adoption of Policies Over Time')
plt.xticks(range(2000,2022,4))
plt.show()
  1. In the choropleth map below, we can observe that New Mexico began adopting naloxone immunity policies almost 15 years before most other states did. From 2015 onward, states began rapidly adopting the primary policies; often ratifying several of them within a single amendment year. By 2021, North Dakota, Texas, Nevada, Wisconsin, and Louisiana had all 12 policies in place, while Oklahoma had only 3.
In [15]:
us_state_to_abbrev = {
    "Alabama": "AL",
    "Alaska": "AK",
    "Arizona": "AZ",
    "Arkansas": "AR",
    "California": "CA",
    "Colorado": "CO",
    "Connecticut": "CT",
    "Delaware": "DE",
    "Florida": "FL",
    "Georgia": "GA",
    "Hawaii": "HI",
    "Idaho": "ID",
    "Illinois": "IL",
    "Indiana": "IN",
    "Iowa": "IA",
    "Kansas": "KS",
    "Kentucky": "KY",
    "Louisiana": "LA",
    "Maine": "ME",
    "Maryland": "MD",
    "Massachusetts": "MA",
    "Michigan": "MI",
    "Minnesota": "MN",
    "Mississippi": "MS",
    "Missouri": "MO",
    "Montana": "MT",
    "Nebraska": "NE",
    "Nevada": "NV",
    "New Hampshire": "NH",
    "New Jersey": "NJ",
    "New Mexico": "NM",
    "New York": "NY",
    "North Carolina": "NC",
    "North Dakota": "ND",
    "Ohio": "OH",
    "Oklahoma": "OK",
    "Oregon": "OR",
    "Pennsylvania": "PA",
    "Rhode Island": "RI",
    "South Carolina": "SC",
    "South Dakota": "SD",
    "Tennessee": "TN",
    "Texas": "TX",
    "Utah": "UT",
    "Vermont": "VT",
    "Virginia": "VA",
    "Washington": "WA",
    "West Virginia": "WV",
    "Wisconsin": "WI",
    "Wyoming": "WY",
    "District of Columbia": "DC",
}
In [16]:
cp_df = df.loc[:, ['State', 'Year']]
totals = df.loc[:, immun_pols].sum(axis=1)
cp_df['Policies'] = totals
cp_df.replace({'State':us_state_to_abbrev}, inplace=True)
cp_df
Out[16]:
State Year Policies
0 AL 2001 0.0
1 AL 2002 0.0
2 AL 2003 0.0
3 AL 2004 0.0
4 AL 2005 0.0
... ... ... ...
1116 WY 2017 11.0
1117 WY 2018 11.0
1118 WY 2019 11.0
1119 WY 2020 11.0
1120 WY 2021 11.0

1071 rows × 3 columns

In [18]:
import plotly.express as px
fig = px.choropleth(cp_df, 
              locations = 'State',
              color= 'Policies', 
              animation_frame="Year",
              color_continuous_scale="Sunset",
              locationmode='USA-states',
              scope="usa",
              range_color=(0, 12),
              title='Number of Primary Policies Implemented by Year',
              height=600
             )
fig.show(renderer="notebook")
In [19]:
corr_matrix = df[immun_pols].corr()
sns.heatmap(corr_matrix, annot=True)
plt.show()